A Marketplace for Web Scale Analytics and Text Annotation Services

نویسندگان

  • Johannes Kirschnick
  • Torsten Kilias
  • Holmer Hemsen
  • Alexander Löser
  • Peter Adolphs
  • Heiko Ehrig
  • Holger Düwiger
چکیده

We present MIA, a data marketplace which enables massive parallel processing of data from the Web. End users can combine both text mining and database operators in a structured query language called MIAQL. MIA offers many cost savings through sharing text data, annotations, built-in analytical functions and third party text mining applications. Our demonstration showcases MIAQL and its execution on the platform for the example of analyzing political campaigns.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

AnnoMarket - Multilingual Text Analytics at Scale on the Cloud

AnnoMarket is an open platform for cloud-based text analytics services and language resources acquisition. Providers of text analytics services and language resources can deploy and monetize their components via the platform, while users can utilize such available resources in multiple languages and in various domains in an on-demand, pay-as-you-go manner. The AnnoMarket platform is deployed on...

متن کامل

Linguistically Light Lexical Extensions for Ontologies

An increasing number of enterprises are beginning to include semantic web ontologies into their Information Extraction (IE) and Text Analytics (TA) applications. This can be challenging for a TA group wishing to avail of semantic web ontologies due to the manual effort of retargeting and tailoring language resources within the TA system to a new domain to meet customer needs. A lightweight lexi...

متن کامل

How to build a WebFountain: An architecture for very large-scale text analytics

WebFountain is a platform for very large-scale text analytics applications. The platform allows uniform access to a wide variety of sources, scalable system-managed deployment of a variety of document-level “augmenters” and corpus-level “miners,” and finally creation of an extensible set of hosted Web services containing information that drives end-user applications. Analytical components can b...

متن کامل

Interoperability and Customisation of Annotation Schemata in Argo

The process of annotating text corpora involves establishing annotation schemata which define the scope and depth of an annotation task at hand. We demonstrate this activity in Argo, a Web-based workbench for the analysis of textual resources, which facilitates both automatic and manual annotation. Annotation tasks in the workbench are defined by building workflows consisting of a selection of ...

متن کامل

A case for automated large-scale semantic annotation

This paper describes Seeker, a platform for large-scale text analytics, and SemTag, an application written on the platform to perform automated semantic tagging of large corpora. We apply SemTag to a collection of approximately 264 million web pages, and generate approximately 434 million automatically disambiguated semantic tags, published to the web as a label bureau providing metadata regard...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014